Albemarle County
Diagnostics for Individual-Level Prediction Instability in Machine Learning for Healthcare
Miller, Elizabeth W., Blume, Jeffrey D.
In healthcare, predictive models increasingly inform patient-level decisions, yet little attention is paid to the variability in individual risk estimates and its impact on treatment decisions. For overparameterized models, now standard in machine learning, a substantial source of variability often goes undetected. Even when the data and model architecture are held fixed, randomness introduced by optimization and initialization can lead to materially different risk estimates for the same patient. This problem is largely obscured by standard evaluation practices, which rely on aggregate performance metrics (e.g., log-loss, accuracy) that are agnostic to individual-level stability. As a result, models with indistinguishable aggregate performance can nonetheless exhibit substantial procedural arbitrariness, which can undermine clinical trust. We propose an evaluation framework that quantifies individual-level prediction instability using two complementary diagnostics: empirical prediction interval width (ePIW), which captures variability in continuous risk estimates, and empirical decision flip rate (eDFR), which measures instability in threshold-based clinical decisions. We apply these diagnostics to simulated data and the GUSTO-I clinical dataset. Across observed settings, we find that for flexible machine-learning models, randomness arising solely from optimization and initialization can induce individual-level variability comparable to that produced by resampling the entire training dataset. Neural networks exhibit substantially greater instability in individual risk predictions than logistic regression models. Risk estimate instability near clinically relevant decision thresholds can alter treatment recommendations. These findings suggest that stability diagnostics should be incorporated into routine model validation for assessing clinical reliability.
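The two diagnostics named in the abstract can be illustrated with a minimal sketch. The abstract does not give formal definitions, so everything here is an assumption: `preds` is a hypothetical array of risk estimates with one row per retraining run (same data and architecture, different seed) and one column per patient; ePIW is taken as the width of a central percentile interval across runs; and eDFR is taken as the fraction of runs whose thresholded decision disagrees with the majority decision for that patient. The function names `empirical_piw` and `empirical_dfr` are illustrative, not the authors' code.

```python
import numpy as np

def empirical_piw(preds, level=0.95):
    """Per-patient width of the central `level` interval across runs.

    Assumed definition: upper percentile minus lower percentile of the
    risk estimates that repeated training runs give the same patient.
    """
    preds = np.asarray(preds, dtype=float)
    alpha = (1.0 - level) / 2.0
    lower = np.quantile(preds, alpha, axis=0)
    upper = np.quantile(preds, 1.0 - alpha, axis=0)
    return upper - lower

def empirical_dfr(preds, threshold):
    """Per-patient share of runs that disagree with the majority decision.

    Assumed definition: a decision is `risk >= threshold`; the flip rate
    is the fraction of runs in the minority for that patient.
    """
    preds = np.asarray(preds, dtype=float)
    decisions = preds >= threshold
    treat_frac = decisions.mean(axis=0)
    return np.minimum(treat_frac, 1.0 - treat_frac)

# Hypothetical risk estimates: 4 training runs (rows) x 3 patients (columns).
preds = np.array([
    [0.10, 0.48, 0.90],
    [0.12, 0.52, 0.88],
    [0.11, 0.55, 0.91],
    [0.09, 0.45, 0.89],
])

piw = empirical_piw(preds, level=0.95)
dfr = empirical_dfr(preds, threshold=0.5)
print("ePIW per patient:", np.round(piw, 3))
print("eDFR per patient:", dfr)
```

In this toy input the second patient sits near the 0.5 threshold, so half of the runs recommend treatment and half do not, while the other two patients receive the same decision in every run despite small changes in their risk estimates.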
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- Asia > Middle East > Saudi Arabia (0.04)
- Asia > India > Maharashtra > Mumbai (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Data Science > Data Mining (0.85)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
SILENCE: Lightweight Protection for Privacy in Offloaded Speech Understanding
Speech serves as a ubiquitous input interface for embedded mobile devices. Cloud-based solutions, while offering powerful speech understanding services, raise significant concerns regarding user privacy. To address this, disentanglement-based encoders have been proposed to remove sensitive information from speech signals without compromising the speech understanding functionality. However, these encoders demand high memory usage and computation complexity, making them impractical for resource-constrained wimpy devices. Our solution is based on a key observation that speech understanding hinges on long-term dependency knowledge of the entire utterance, in contrast to privacy-sensitive elements that are short-term dependent. Exploiting this observation, we propose SILENCE, a lightweight system that selectively obscures short-term details without damaging the long-term dependent speech understanding performance. The crucial part of SILENCE is a differential mask generator derived from interpretable learning to automatically configure the masking process. We have implemented SILENCE on the STM32H7 microcontroller and evaluated its efficacy under different attacking scenarios. Our results demonstrate that SILENCE offers speech understanding performance and privacy protection capacity comparable to existing encoders, while achieving up to 53.3× speedup and 134.1× reduction in memory footprint.
- Asia > China > Guangdong Province > Shenzhen (0.05)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.68)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- North America > United States > California > Los Angeles County > Los Angeles (0.30)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
- Europe > France (0.05)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Middle East > Israel (0.04)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > Middle East > Jordan (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > France (0.04)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.15)
- Asia > Singapore (0.05)
- North America > United States > New York > New York County > New York City (0.04)
MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models
However, there is little to no understanding of the notion of medical safety in the context of LLMs, let alone how to evaluate and improve it. To address this gap, we first define the notion of medical safety in LLMs based on the Principles of Medical Ethics set forth by the American Medical Association.
- Europe > United Kingdom (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)